Spectacle: Faster and more accurate chromatin state annotation using spectral learning
Abstract
Recently, a wealth of epigenomic data has been generated by biochemical assays and next-generation sequencing (NGS) technologies. In particular, histone modification data generated by the ENCODE project and other large-scale projects show specific patterns associated with regulatory elements in the human genome. Rather than mapping histone modifications individually to regulatory elements, it is important to build a unified statistical model that deciphers the patterns of multiple histone modifications in a cell type and annotates chromatin states such as transcription start sites, enhancers and transcribed regions. Several genome-wide statistical models have been developed based on hidden Markov models (HMMs). These methods typically use the Expectation-Maximization (EM) algorithm to estimate the parameters of the model. Here we used spectral learning, a state-of-the-art parameter estimation algorithm in machine learning. We found that spectral learning plus a few (up to five) iterations of local optimization of the likelihood outperforms the standard EM algorithm. We also evaluated our software implementation, called Spectacle, on independent biological datasets and found that Spectacle annotated experimentally defined functional elements such as enhancers significantly better than a previous state-of-the-art method. Spectacle can be downloaded from https://github.com/jiminsong/Spectacle.
Similar resources
Accurate Annotation of Remote Sensing Images via Active Spectral Clustering with Little Expert Knowledge
It is a challenging problem to efficiently interpret the large volumes of remotely sensed image data being collected in the current age of remote sensing “big data”. Although human visual interpretation can yield accurate annotation of remote sensing images, it demands considerable expert knowledge and is always time-consuming, which strongly hinders its efficiency. Alternatively, intelligent a...
The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning
In our modern technological world, Computer-Assisted Language Learning (CALL) is a new realm for learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...
DeepFruits: A Fruit Detection System Using Deep Neural Networks
This paper presents a novel approach to fruit detection using deep convolutional neural networks. The aim is to build an accurate, fast and reliable fruit detection system, which is a vital element of an autonomous agricultural robotic platform; it is a key element for fruit yield estimation and automated harvesting. Recent work in deep neural networks has led to the development of a state-of-t...
StateHub-StatePaintR: rapid and reproducible chromatin state evaluation for custom genome annotation
Genome annotation is critical to understand the function of disease variants, especially for clinical applications. To meet this need there are segmentations available from public consortia reflecting varying unsupervised approaches to functional annotation based on epigenetics data, but there remains a need for transparent, reproducible, and easily interpreted genomic maps of the functional bi...
Op-nare120972 827..841
The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate a...